Translation, Scale and Rotation: Cross-Modal Alignment Meets RGB-Infrared Vehicle Detection
Authors
Abstract
Integrating multispectral data in object detection, especially visible and infrared images, has received great attention in recent years. Since visible (RGB) and infrared (IR) images can provide complementary information to handle light variations, paired images are used in many fields, such as pedestrian detection, RGB-IR crowd counting, and salient object detection. Compared with natural RGB-IR images, we find that detection in aerial RGB-IR images suffers from weak cross-modal misalignment, which is manifested in position, size, and angle deviations of the same object. In this paper, we mainly address the challenge of weak cross-modal misalignment in aerial RGB-IR images. Specifically, we first explain and analyze the cause of the weak misalignment problem. Then, we propose a Translation-Scale-Rotation Alignment (TSRA) module to address the problem by calibrating the feature maps from these two modalities. The module predicts the deviation between the objects of the two modalities through an alignment process and utilizes a Modality-Selection (MS) strategy to improve the performance of alignment. Finally, a two-stream feature alignment detector (TSFADet) based on the TSRA module is constructed for RGB-IR object detection in aerial images. With comprehensive experiments on the public DroneVehicle datasets, we verify that our method reduces the effect of cross-modal misalignment and achieves robust detection results.
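To make the three kinds of deviation named in the abstract concrete, the following is a minimal sketch of applying a translation, scale, and rotation offset to an oriented bounding box. It is not the TSRA module itself, which the abstract describes as operating on feature maps. The `OrientedBox` type, the `apply_deviation` function, and all numeric values are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class OrientedBox:
    """Hypothetical oriented box: centre, size, and angle in radians."""
    cx: float
    cy: float
    w: float
    h: float
    angle: float


def apply_deviation(box, dx, dy, sw, sh, dtheta):
    """Shift, rescale, and rotate a box by an assumed predicted deviation.

    dx, dy  : translation of the centre (position deviation)
    sw, sh  : multiplicative factors for width and height (size deviation)
    dtheta  : additive change of orientation in radians (angle deviation)
    """
    return OrientedBox(
        cx=box.cx + dx,
        cy=box.cy + dy,
        w=box.w * sw,
        h=box.h * sh,
        angle=box.angle + dtheta,
    )


def corners(box):
    """Return the four corner points of an oriented box."""
    c, s = math.cos(box.angle), math.sin(box.angle)
    half = [(-box.w / 2, -box.h / 2), (box.w / 2, -box.h / 2),
            (box.w / 2, box.h / 2), (-box.w / 2, box.h / 2)]
    return [(box.cx + x * c - y * s, box.cy + x * s + y * c) for x, y in half]


# Illustrative values only: a box seen in one modality and an assumed
# deviation that maps it onto the same vehicle in the other modality.
ir_box = OrientedBox(cx=100.0, cy=50.0, w=40.0, h=20.0, angle=0.0)
aligned = apply_deviation(ir_box, dx=3.0, dy=-2.0, sw=1.1, sh=0.9,
                          dtheta=math.radians(5))
```

Separating the deviation into five scalars mirrors the position, size, and angle terms listed in the abstract; how the published detector parameterizes or predicts them is not stated here.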
Related resources
Translation, Rotation and Scale Invariant Object
A method for object recognition invariant under translation, rotation and scaling is addressed. The first step of the method (preprocessing) takes into account the invariant properties of the normalized moment of inertia and a novel coding that extracts topological object characteristics. The second step (recognition) is achieved by using a Holographic Nearest Neighbor algorithm (HNN), where vec...
Rotation, Scale and Translation Invariant Digital Image Watermarking Joseph
A digital watermark is an invisible mark embedded in a digital image which may be used for copyright protection. This paper describes how Fourier-Mellin transform-based invariants can be used for digital image watermarking. The embedded marks are designed to be unaffected by any combination of rotation, scale and translation transformations. The original image is not required for extracting the...
Rotation, Scale and Translation Invariant Digital
A digital watermark is an invisible mark embedded in a digital image which may be used for copyright protection. This paper describes how Fourier-Mellin transform-based invariants can be used for digital image watermarking. The embedded marks are designed to be unaffected by any combination of rotation, scale and translation transformations. The original image is not required for extracting the ...
Translation, rotation, and scale-invariant object recognition
A method for object recognition, invariant under translation, rotation, and scaling, is addressed. The first step of the method (preprocessing) takes into account the invariant properties of the normalized moment of inertia and a novel coding that extracts topological object characteristics. The second step (recognition) is achieved by using a holographic nearest-neighbor algorithm (HNN), in wh...
RGB-D Salient Object Detection Based on Discriminative Cross-modal Transfer Learning
In this work, we propose to utilize Convolutional Neural Networks (CNNs) to boost the performance of depth-induced salient object detection by capturing the high-level representative features for depth modality. We formulate the depth-induced saliency detection as a CNN-based cross-modal transfer problem to bridge the gap between the "data-hungry" nature of CNNs and the unavailability of suff...
Journal
Journal title: Lecture Notes in Computer Science
Year: 2022
ISSN: 1611-3349, 0302-9743
DOI: https://doi.org/10.1007/978-3-031-20077-9_30